| Number of Variables | 20 |
|---|---|
| Number of Rows | 59591 |
| Missing Cells | 237020 |
| Missing Cells (%) | 19.9% |
| Duplicate Rows | 8041 |
| Duplicate Rows (%) | 13.5% |
| Total Size in Memory | 47.1 MB |
| Average Row Size in Memory | 829.2 B |
| Variable Types | Categorical: 6, Numerical: 14 |

| Insight | Type |
|---|---|
| furniture has 42887 (71.97%) missing values | Missing |
| floornumber has 58530 (98.22%) missing values | Missing |
| direction has 44541 (74.74%) missing values | Missing |
| bedroom has 754 (1.27%) missing values | Missing |
| bathroom has 1089 (1.83%) missing values | Missing |
| facade has 44249 (74.25%) missing values | Missing |
| street_size has 44464 (74.62%) missing values | Missing |
| price_per_square is skewed | Skewed |
| floornumber is skewed | Skewed |
| area is skewed | Skewed |
| price is skewed | Skewed |
| lat is skewed | Skewed |
| lng is skewed | Skewed |
| bedroom is skewed | Skewed |
| bathroom is skewed | Skewed |
| floors is skewed | Skewed |
| facade is skewed | Skewed |
| street_size is skewed | Skewed |
| num_people is skewed | Skewed |
| area(m2) is skewed | Skewed |
| density(people/m2) is skewed | Skewed |
| Dataset has 8041 (13.49%) duplicate rows | Duplicates |
| address has a high cardinality: 13922 distinct values | High Cardinality |
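
The percentages in the overview and insight tables follow directly from the raw counts. A minimal sketch in plain Python that recomputes three of them from the figures reported above; the helper name `pct` is illustrative and not part of any profiling tool:

```python
# Counts copied from the overview and insight tables above.
n_rows = 59591
n_vars = 20
missing_cells = 237020
duplicate_rows = 8041
furniture_missing = 42887


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Missing cells are measured against every cell (rows x variables).
missing_pct = pct(missing_cells, n_rows * n_vars)
# Duplicates and per-column missing values are measured against rows.
duplicate_pct = pct(duplicate_rows, n_rows)
furniture_missing_pct = pct(furniture_missing, n_rows)

print(f"missing cells:     {missing_pct:.1f}%")
print(f"duplicate rows:    {duplicate_pct:.2f}%")
print(f"furniture missing: {furniture_missing_pct:.2f}%")
```

Note that the cell-level figure divides by rows × variables, while the row-level figures divide by rows only, which is why 19.9% missing cells can coexist with single columns that are more than 70% empty.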
categorical
| Approximate Distinct Count | 30 |
|---|---|
| Approximate Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory Size | 6.6 MB |
| Mean | 8.6656 |
|---|---|
| Standard Deviation | 1.6808 |
| Median | 9 |
| Minimum | 5 |
| Maximum | 12 |
| 1st row | Ba Vì |
|---|---|
| 2nd row | Ba Vì |
| 3rd row | Ba Vì |
| 4th row | Ba Vì |
| 5th row | Ba Vì |
| Count | 330688 |
|---|---|
| Lowercase Letter | 222328 |
| Space Separator | 70902 |
| Uppercase Letter | 108360 |
| Dash Punctuation | 0 |
| Decimal Number | 0 |
numerical
| Approximate Distinct Count | 9616 |
|---|---|
| Approximate Unique (%) | 16.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 931.1 KB |
| Mean | 103.6491 |
| Minimum | 0.01 |
| Maximum | 1140 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0.01 |
|---|---|
| 5-th Percentile | 17.57 |
| Q1 | 40.51 |
| Median | 88.37 |
| Q3 | 124 |
| 95-th Percentile | 268.675 |
| Maximum | 1140 |
| Range | 1139.99 |
| IQR | 83.49 |
| Mean | 103.6491 |
|---|---|
| Standard Deviation | 93.3681 |
| Variance | 8717.5964 |
| Sum | 6.1766e+06 |
| Skewness | 3.0925 |
| Kurtosis | 15.7424 |
| Coefficient of Variation | 0.9008 |
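
Several rows in the quantile and descriptive tables are derived from the others, so they can be cross-checked by hand. The sketch below redoes that arithmetic for the numerical variable summarized above, using only the reported values. Small differences are expected because the report rounds its inputs to four decimals.

```python
# Values copied from the tables above; n_rows comes from the dataset overview.
minimum, maximum = 0.01, 1140
q1, q3 = 40.51, 124
mean, std = 103.6491, 93.3681
n_rows, n_missing = 59591, 0

value_range = maximum - minimum          # Range = Maximum - Minimum
iqr = q3 - q1                            # IQR = Q3 - Q1
cv = std / mean                          # Coefficient of Variation
variance = std ** 2                      # Variance = Standard Deviation squared
total = mean * (n_rows - n_missing)      # Sum = Mean x non-missing count

print(f"range    = {value_range:.2f}")
print(f"IQR      = {iqr:.2f}")
print(f"CV       = {cv:.4f}")
print(f"variance = {variance:.2f}")
print(f"sum      = {total:.4e}")
```

The recomputed sum is roughly 6.18 million, which is the order of magnitude to expect in the Sum row for this variable.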
categorical
| Approximate Distinct Count | 13922 |
|---|---|
| Approximate Unique (%) | 23.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory Size | 16.1 MB |
| Mean | 57.4129 |
|---|---|
| Standard Deviation | 10.7256 |
| Median | 57 |
| Minimum | 14 |
| Maximum | 163 |
| 1st row | Xã Tản Lĩnh, Huyện... |
|---|---|
| 2nd row | Thôn Nghe Xã Vân H... |
| 3rd row | Xã Yên Bài, Ba Vì,... |
| 4th row | Xã Yên Bài, Ba Vì,... |
| 5th row | Xã Yên Bài, Ba Vì,... |
| Count | 1755692 |
|---|---|
| Lowercase Letter | 1167319 |
| Space Separator | 645951 |
| Uppercase Letter | 588373 |
| Dash Punctuation | 734 |
| Decimal Number | 27159 |
categorical
| Approximate Distinct Count | 4 |
|---|---|
| Approximate Unique (%) | 0.0% |
| Missing | 42887 |
| Missing (%) | 72.0% |
| Memory Size | 2.3 MB |
| Mean | 15.6413 |
|---|---|
| Standard Deviation | 0.882 |
| Median | 15 |
| Minimum | 12 |
| Maximum | 17 |
| 1st row | Nội thất cao cấp |
|---|---|
| 2nd row | Nội thất đầy đủ |
| 3rd row | Bàn giao thô |
| 4th row | Nội thất cao cấp |
| 5th row | Nội thất đầy đủ |
| Count | 133204 |
|---|---|
| Lowercase Letter | 116500 |
| Space Separator | 49890 |
| Uppercase Letter | 16704 |
| Dash Punctuation | 0 |
| Decimal Number | 0 |
numerical
| Approximate Distinct Count | 45 |
|---|---|
| Approximate Unique (%) | 4.2% |
| Missing | 58530 |
| Missing (%) | 98.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 16.6 KB |
| Mean | 13.1122 |
| Minimum | 1 |
| Maximum | 952 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 1 |
|---|---|
| 5-th Percentile | 2 |
| Q1 | 5 |
| Median | 10 |
| Q3 | 18 |
| 95-th Percentile | 30 |
| Maximum | 952 |
| Range | 951 |
| IQR | 13 |
| Mean | 13.1122 |
|---|---|
| Standard Deviation | 30.2178 |
| Variance | 913.1167 |
| Sum | 13912 |
| Skewness | 28.3292 |
| Kurtosis | 877.0718 |
| Coefficient of Variation | 2.3046 |
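
The missing count of 58530 matches the `floornumber` alert, and the gap between the 95th percentile (30) and the maximum (952) suggests a few extreme entries drive the skewness. The following is a sketch of one common way to flag such values, Tukey's 1.5 × IQR fences. This rule is an assumption brought in for illustration; the report itself does not apply it.

```python
# Values copied from the tables above; n_rows comes from the dataset overview.
q1, q3 = 5, 18
maximum = 952
mean = 13.1122
n_rows, n_missing = 59591, 58530

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr   # values below this would be flagged
upper_fence = q3 + 1.5 * iqr   # values above this would be flagged
max_is_outlier = maximum > upper_fence

# Consistency check: mean x non-missing count should reproduce the Sum row.
n_present = n_rows - n_missing
approx_sum = mean * n_present

print(f"fences: [{lower_fence}, {upper_fence}]")
print(f"maximum {maximum} flagged: {max_is_outlier}")
print(f"non-missing rows: {n_present}, approx. sum: {approx_sum:.0f}")
```

With only about a thousand non-missing rows, any threshold chosen here should be treated as a starting point for manual review rather than an automatic filter.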
numerical
| Approximate Distinct Count | 1268 |
|---|---|
| Approximate Unique (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 931.1 KB |
| Mean | 63.0648 |
| Minimum | 1 |
| Maximum | 500 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 1 |
|---|---|
| 5-th Percentile | 30 |
| Q1 | 39 |
| Median | 51 |
| Q3 | 73 |
| 95-th Percentile | 133 |
| Maximum | 500 |
| Range | 499 |
| IQR | 34 |
| Mean | 63.0648 |
|---|---|
| Standard Deviation | 40.6975 |
| Variance | 1656.2844 |
| Sum | 3.7581e+06 |
| Skewness | 3.336 |
| Kurtosis | 17.5375 |
| Coefficient of Variation | 0.6453 |
categorical
| Approximate Distinct Count | 8 |
|---|---|
| Approximate Unique (%) | 0.1% |
| Missing | 44541 |
| Missing (%) | 74.7% |
| Memory Size | 1.5 MB |
| Mean | 6.3045 |
|---|---|
| Standard Deviation | 2.0407 |
| Median | 7 |
| Minimum | 3 |
| Maximum | 8 |
| 1st row | Đông Bắc |
|---|---|
| 2nd row | Đông Nam |
| 3rd row | Đông Nam |
| 4th row | Nam |
| 5th row | Nam |
| Count | 58552 |
|---|---|
| Lowercase Letter | 39939 |
| Space Separator | 10659 |
| Uppercase Letter | 18613 |
| Dash Punctuation | 0 |
| Decimal Number | 0 |
numerical
| Approximate Distinct Count | 2970 |
|---|---|
| Approximate Unique (%) | 5.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 931.1 KB |
| Mean | 6.9781 |
| Minimum | 0.001 |
| Maximum | 282 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0.001 |
|---|---|
| 5-th Percentile | 1 |
| Q1 | 2.4 |
| Median | 3.8 |
| Q3 | 6.6 |
| 95-th Percentile | 24 |
| Maximum | 282 |
| Range | 281.999 |
| IQR | 4.2 |
| Mean | 6.9781 |
|---|---|
| Standard Deviation | 12.053 |
| Variance | 145.2757 |
| Sum | 415831.8907 |
| Skewness | 8.4315 |
| Kurtosis | 118.516 |
| Coefficient of Variation | 1.7273 |
numerical
| Approximate Distinct Count | 4414 |
|---|---|
| Approximate Unique (%) | 7.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 931.1 KB |
| Mean | 21.0131 |
| Minimum | 20.6307 |
| Maximum | 21.3165 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 20.6307 |
|---|---|
| 5-th Percentile | 20.9582 |
| Q1 | 20.9869 |
| Median | 21.0114 |
| Q3 | 21.0369 |
| 95-th Percentile | 21.0734 |
| Maximum | 21.3165 |
| Range | 0.6857 |
| IQR | 0.04999 |
| Mean | 21.0131 |
|---|---|
| Standard Deviation | 0.03914 |
| Variance | 0.001532 |
| Sum | 1.2522e+06 |
| Skewness | 0.4647 |
| Kurtosis | 5.0581 |
| Coefficient of Variation | 0.001863 |
numerical
| Approximate Distinct Count | 4593 |
|---|---|
| Approximate Unique (%) | 7.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 931.1 KB |
| Mean | 105.814 |
| Minimum | 105.3503 |
| Maximum | 107.0768 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 105.3503 |
|---|---|
| 5-th Percentile | 105.7406 |
| Q1 | 105.7856 |
| Median | 105.8112 |
| Q3 | 105.844 |
| 95-th Percentile | 105.9075 |
| Maximum | 107.0768 |
| Range | 1.7266 |
| IQR | 0.0584 |
| Mean | 105.814 |
|---|---|
| Standard Deviation | 0.05655 |
| Variance | 0.003198 |
| Sum | 6.3056e+06 |
| Skewness | -0.3427 |
| Kurtosis | 23.6917 |
| Coefficient of Variation | 0.00053446 |
numerical
| Approximate Distinct Count | 46 |
|---|---|
| Approximate Unique (%) | 0.1% |
| Missing | 754 |
| Missing (%) | 1.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 919.3 KB |
| Mean | 3.6087 |
| Minimum | 1 |
| Maximum | 100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 1 |
|---|---|
| 5-th Percentile | 1 |
| Q1 | 2 |
| Median | 3 |
| Q3 | 4 |
| 95-th Percentile | 6 |
| Maximum | 100 |
| Range | 99 |
| IQR | 2 |
| Mean | 3.6087 |
|---|---|
| Standard Deviation | 2.2601 |
| Variance | 5.1079 |
| Sum | 212326 |
| Skewness | 7.9787 |
| Kurtosis | 163.9514 |
| Coefficient of Variation | 0.6263 |
numerical
| Approximate Distinct Count | 30 |
|---|---|
| Approximate Unique (%) | 0.1% |
| Missing | 1089 |
| Missing (%) | 1.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 914.1 KB |
| Mean | 3.2225 |
| Minimum | 1 |
| Maximum | 42 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 1 |
|---|---|
| 5-th Percentile | 1 |
| Q1 | 2 |
| Median | 3 |
| Q3 | 4 |
| 95-th Percentile | 5 |
| Maximum | 42 |
| Range | 41 |
| IQR | 2 |
| Mean | 3.2225 |
|---|---|
| Standard Deviation | 1.5296 |
| Variance | 2.3398 |
| Sum | 188523 |
| Skewness | 3.6199 |
| Kurtosis | 55.2181 |
| Coefficient of Variation | 0.4747 |
numerical
| Approximate Distinct Count | 37 |
|---|---|
| Approximate Unique (%) | 0.1% |
| Missing | 506 |
| Missing (%) | 0.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 923.2 KB |
| Mean | 3.6489 |
| Minimum | 1 |
| Maximum | 105 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 1 |
|---|---|
| 5-th Percentile | 1 |
| Q1 | 1 |
| Median | 4 |
| Q3 | 5 |
| 95-th Percentile | 6 |
| Maximum | 105 |
| Range | 104 |
| IQR | 4 |
| Mean | 3.6489 |
|---|---|
| Standard Deviation | 2.0748 |
| Variance | 4.3048 |
| Sum | 215597 |
| Skewness | 4.0565 |
| Kurtosis | 134.1045 |
| Coefficient of Variation | 0.5686 |
categorical
| Approximate Distinct Count | 6 |
|---|---|
| Approximate Unique (%) | 0.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory Size | 7.6 MB |
| Mean | 13.2154 |
|---|---|
| Standard Deviation | 2.2271 |
| Median | 13 |
| Minimum | 7 |
| Maximum | 26 |
| 1st row | Bán đất |
|---|---|
| 2nd row | Bán đất |
| 3rd row | Bán đất |
| 4th row | Bán đất |
| 5th row | Bán đất |
| Count | 493889 |
|---|---|
| Lowercase Letter | 431520 |
| Space Separator | 133384 |
| Uppercase Letter | 62369 |
| Dash Punctuation | 2778 |
| Decimal Number | 0 |
numerical
| Approximate Distinct Count | 312 |
|---|---|
| Approximate Unique (%) | 2.0% |
| Missing | 44249 |
| Missing (%) | 74.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 239.7 KB |
| Mean | 5.8113 |
| Minimum | 0 |
| Maximum | 478 |
| Zeros | 26 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0 |
|---|---|
| 5-th Percentile | 3.5 |
| Q1 | 4 |
| Median | 4.6 |
| Q3 | 6 |
| 95-th Percentile | 11 |
| Maximum | 478 |
| Range | 478 |
| IQR | 2 |
| Mean | 5.8113 |
|---|---|
| Standard Deviation | 9.4876 |
| Variance | 90.0142 |
| Sum | 89157.62 |
| Skewness | 38.1067 |
| Kurtosis | 1750.8874 |
| Coefficient of Variation | 1.6326 |
numerical
| Approximate Distinct Count | 115 |
|---|---|
| Approximate Unique (%) | 0.8% |
| Missing | 44464 |
| Missing (%) | 74.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 236.4 KB |
| Mean | 9.0355 |
| Minimum | 0 |
| Maximum | 386 |
| Zeros | 42 |
| Zeros (%) | 0.1% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 0 |
|---|---|
| 5-th Percentile | 2.2 |
| Q1 | 3 |
| Median | 5 |
| Q3 | 10 |
| 95-th Percentile | 30 |
| Maximum | 386 |
| Range | 386 |
| IQR | 7 |
| Mean | 9.0355 |
|---|---|
| Standard Deviation | 10.615 |
| Variance | 112.6775 |
| Sum | 136680.57 |
| Skewness | 6.4813 |
| Kurtosis | 135.2011 |
| Coefficient of Variation | 1.1748 |
numerical
| Approximate Distinct Count | 30 |
|---|---|
| Approximate Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 931.1 KB |
| Mean | 323942.5093 |
| Minimum | 135618 |
| Maximum | 506347 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 135618 |
|---|---|
| 5-th Percentile | 160495 |
| Q1 | 275745 |
| Median | 303586 |
| Q3 | 371606 |
| 95-th Percentile | 506347 |
| Maximum | 506347 |
| Range | 370729 |
| IQR | 95861 |
| Mean | 323942.5093 |
|---|---|
| Standard Deviation | 85836.002 |
| Variance | 7.3678e+09 |
| Sum | 1.9304e+10 |
| Skewness | 0.4111 |
| Kurtosis | 0.2438 |
| Coefficient of Variation | 0.265 |
numerical
| Approximate Distinct Count | 30 |
|---|---|
| Approximate Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 931.1 KB |
| Mean | 36.9968 |
| Minimum | 5.3 |
| Maximum | 423 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 5.3 |
|---|---|
| 5-th Percentile | 9.1 |
| Q1 | 10.3 |
| Median | 32.2 |
| Q3 | 49.6 |
| 95-th Percentile | 116.7 |
| Maximum | 423 |
| Range | 417.7 |
| IQR | 39.3 |
| Mean | 36.9968 |
|---|---|
| Standard Deviation | 36.5618 |
| Variance | 1336.7647 |
| Sum | 2.2047e+06 |
| Skewness | 3.4079 |
| Kurtosis | 19.9379 |
| Coefficient of Variation | 0.9882 |
numerical
| Approximate Distinct Count | 30 |
|---|---|
| Approximate Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 931.1 KB |
| Mean | 16447.8807 |
| Minimum | 687 |
| Maximum | 37161 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |
| Minimum | 687 |
|---|---|
| 5-th Percentile | 2452 |
| Q1 | 7529 |
| Median | 12564 |
| Q3 | 25588 |
| 95-th Percentile | 37161 |
| Maximum | 37161 |
| Range | 36474 |
| IQR | 18059 |
| Mean | 16447.8807 |
|---|---|
| Standard Deviation | 11623.7526 |
| Variance | 1.3511e+08 |
| Sum | 9.8015e+08 |
| Skewness | 0.5066 |
| Kurtosis | -1.275 |
| Coefficient of Variation | 0.7067 |
categorical
| Approximate Distinct Count | 3 |
|---|---|
| Approximate Unique (%) | 0.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory Size | 6.0 MB |
| Mean | 4.1002 |
|---|---|
| Standard Deviation | 0.306 |
| Median | 4 |
| Minimum | 4 |
| Maximum | 6 |
| 1st row | huyện |
|---|---|
| 2nd row | huyện |
| 3rd row | huyện |
| 4th row | huyện |
| 5th row | huyện |
| Count | 184537 |
|---|---|
| Lowercase Letter | 184537 |
| Space Separator | 104 |
| Uppercase Letter | 0 |
| Dash Punctuation | 0 |
| Decimal Number | 0 |